Part A

Project Details

Research Questions

Part B

Variable Information

Names

HDI_rank_2019
Country
HDI_Value
Life_expectancy
Expected_years_of_schooling
Mean_years_of_schooling
GNI_per_capita
GNI_rank_minus_HDI_rank
HDI_rank_2018
Degree_of_Human_Development

Description

  • HDI_rank_2019: Ranking by HDI value for 2019. The HDI is a composite index measuring average achievement in three basic dimensions of human development: a long and healthy life, knowledge and a decent standard of living.

  • Country: List of countries for which HDI statistics are calculated.

  • HDI_Value: Summary measure of average achievement in key dimensions of human development: a long and healthy life, being knowledgeable and having a decent standard of living.

  • Life_expectancy: Life expectancy at birth, i.e. the number of years a newborn infant could expect to live if prevailing patterns of age-specific mortality rates at the time of birth stay the same throughout the infant’s life.

  • Expected_years_of_schooling: Number of years of schooling that a child of school entrance age can expect to receive if prevailing patterns of age-specific enrollment rates persist throughout the child’s life.

  • Mean_years_of_schooling: Average number of years of education received by people ages 25 and older, converted from education attainment levels using official durations of each level.

  • GNI_per_capita: ‘Gross National Income’ per capita, i.e. the aggregate income of an economy generated by its production and its ownership of factors of production, less the income paid for the use of factors of production owned by the rest of the world, converted to international dollars using PPP (Purchasing Power Parity) rates, divided by midyear population.

  • GNI_rank_minus_HDI_rank: Difference in ranking by GNI per capita and by HDI value. A negative value means that the country is better ranked by GNI than by HDI value.

  • HDI_rank_2018: Ranking by HDI value for 2018, calculated using the same most recently revised data available in 2020 that were used to calculate HDI values for 2019.

  • Degree_of_Human_Development : The cutoff-points are HDI of less than 0.550 for low human development, 0.550–0.699 for medium human development, 0.700–0.799 for high human development and 0.800 or greater for very high human development.

Question 1

Q1. To compute the summary statistics for each variable for every Human Development category.

Column

Fig 1.1: The number of countries in each degree of human development.

Fig 1.2: Boxplots that summarize the descriptive statistics for each variable.

Column

Table 1.1: Descriptive analysis for each variable based on Degree of Human Development.

Variable Degree_of_Human_Development Minimum Median Mean Maximum SD
Expected_years_of_schooling VERY HIGH HUMAN DEVELOPMENT 12.04 16.12 16.14 21.95 1.82
Expected_years_of_schooling HIGH HUMAN DEVELOPMENT 11.19 13.61 13.60 16.87 1.14
Expected_years_of_schooling MEDIUM HUMAN DEVELOPMENT 8.28 11.60 11.50 13.72 1.13
Expected_years_of_schooling LOW HUMAN DEVELOPMENT 5.01 9.70 9.30 12.66 1.87
GNI_per_capita VERY HIGH HUMAN DEVELOPMENT 14428.80 39870.68 42929.79 131031.59 20854.39
GNI_per_capita HIGH HUMAN DEVELOPMENT 5039.04 13009.07 13184.34 26903.25 4763.56
GNI_per_capita MEDIUM HUMAN DEVELOPMENT 2253.35 4960.53 5694.22 13944.13 2682.41
GNI_per_capita LOW HUMAN DEVELOPMENT 753.91 2132.96 2385.03 5689.35 1284.51
HDI_Value VERY HIGH HUMAN DEVELOPMENT 0.80 0.88 0.88 0.96 0.05
HDI_Value HIGH HUMAN DEVELOPMENT 0.70 0.74 0.75 0.80 0.03
HDI_Value MEDIUM HUMAN DEVELOPMENT 0.55 0.61 0.62 0.70 0.04
HDI_Value LOW HUMAN DEVELOPMENT 0.39 0.48 0.49 0.55 0.05
Life_expectancy VERY HIGH HUMAN DEVELOPMENT 72.58 80.21 79.45 84.86 3.29
Life_expectancy HIGH HUMAN DEVELOPMENT 64.13 74.25 74.00 78.93 3.27
Life_expectancy MEDIUM HUMAN DEVELOPMENT 58.74 69.66 68.43 76.68 4.72
Life_expectancy LOW HUMAN DEVELOPMENT 53.28 62.05 61.95 69.02 4.37
Mean_years_of_schooling VERY HIGH HUMAN DEVELOPMENT 7.28 12.14 11.62 14.15 1.47
Mean_years_of_schooling HIGH HUMAN DEVELOPMENT 7.02 9.39 9.47 11.81 1.26
Mean_years_of_schooling MEDIUM HUMAN DEVELOPMENT 4.07 6.50 6.51 11.10 1.52
Mean_years_of_schooling LOW HUMAN DEVELOPMENT 1.64 3.93 4.25 6.76 1.37

Part A

Q2. To compare and contrast the GNI columns for very high and medium development countries and to conduct an analysis to assess whether the very high development countries translate their income better than the medium development countries in the areas of human development.

Column

Fig 2.1: Understanding the distribution of variables.

Fig 2.2: Log-transformation of GNI per capita for further analysis.

Column

Table 2.1: GNI Summary Statistics.

Degree_of_Human_Development min q1 median q3 max mean sd n
MEDIUM HUMAN DEVELOPMENT 7.7 8.3 8.5 8.9 9.5 8.6 0.4 37
VERY HIGH HUMAN DEVELOPMENT 9.6 10.2 10.6 10.9 11.8 10.6 0.5 66

Part B

Column

Fig 2.3: Distribution of GNI per capita (log) for each degree of human development.

Fig 2.4: Comparing the regression.

Column

Table 2.2: Values for very high HDI group.

  Log of GNI per capita
Predictors Estimates CI p
(Intercept) 4.67 3.26 – 6.07 <0.001
HDI_Value 6.71 5.11 – 8.30 <0.001
Observations 66
R2 / R2 adjusted 0.525 / 0.517

Table 2.3: Values for medium HDI group.

  Log of GNI per capita
Predictors Estimates CI p
(Intercept) 4.84 3.02 – 6.65 <0.001
HDI_Value 6.01 3.08 – 8.94 <0.001
Observations 37
R2 / R2 adjusted 0.331 / 0.312

Column

The fitted slope is 6.71 for the Very High HDI Group and 6.01 for the Medium HDI Group.

Part C

Column

Fig 2.5: Diagnostic Plots for Linear Models.

Analysis for Q2 Part C.

  • In Fig 2.5, the left-hand graphs represent the ‘Very High HDI Group’ and the right-hand graphs represent the ‘Medium HDI Group’.

  • From the diagnostic plots, we see that the residuals of both models are approximately normally distributed, so a linear model is a reasonable fit to the data.

  • However, from Table 2.4 and Table 2.5, we see that both models show a low coefficient of determination (R² of 0.52 and 0.33 respectively). This suggests that GNI is not the only fundamental variable that determines the HDI value.

  • Therefore, we can conclude that although the very high HDI group performs better in both GNI and HDI than the medium HDI group, there is only a medium correlation between GNI and HDI values, so the evidence that the very high HDI group translates income better is limited.

Column

Table 2.4: Very High HDI Group Linear Model Values.

r.squared adj.r.squared sigma statistic p.value df logLik AIC BIC deviance df.residual nobs
0.5247895 0.5173643 0.3244632 70.67715 0 1 -18.34599 42.69197 49.26094 6.737687 64 66

Column

Table 2.5: Medium HDI Group Linear Model Values.

r.squared adj.r.squared sigma statistic p.value df logLik AIC BIC deviance df.residual nobs
0.3311233 0.3120126 0.3627941 17.32654 0.0001946 1 -13.95765 33.91531 38.74806 4.606685 35 37

Part A

Column

Q3. Calculate the gap between the ‘expected years of education’ and the ‘average years of education’ for countries with different levels of Human Development. Let this gap be called: Residual years of education.

We analyze whether countries with high levels of human development are estimated to have high levels of expected education. From this, we judge whether it is inaccurate to use the ‘average value of expected years of education and average years of education’ for the calculation of HDI.

Column

Fig 3.1: Residual years of education for High HDI group (Green) and Very High HDI group (Red).

Fig 3.2: Residual years of education for Low HDI group (Light Green) and Medium HDI group (Orange).

Column

Analysis for Q3 Part A.

  • Fig 3.1 and 3.2 show that the residual years of education in some countries with high human development are larger than in countries with low human development (Kovacevic, 2010).

Part B

Column

Table 3.1: Top 12 countries with the largest residual years of schooling across the different degrees of human development.

Country Degree_of_Human_Development residual_years_of_schooling
Australia VERY HIGH HUMAN DEVELOPMENT 9.229639
Bhutan MEDIUM HUMAN DEVELOPMENT 8.908858
Benin LOW HUMAN DEVELOPMENT 8.788222
Turkey VERY HIGH HUMAN DEVELOPMENT 8.495846
Morocco MEDIUM HUMAN DEVELOPMENT 8.073170
Uruguay VERY HIGH HUMAN DEVELOPMENT 7.909910
Tunisia HIGH HUMAN DEVELOPMENT 7.905390
Grenada HIGH HUMAN DEVELOPMENT 7.837766
Timor-Leste MEDIUM HUMAN DEVELOPMENT 7.821715
Burundi LOW HUMAN DEVELOPMENT 7.781347
Nepal MEDIUM HUMAN DEVELOPMENT 7.742130
Belgium VERY HIGH HUMAN DEVELOPMENT 7.724110

Column

Table 3.2: Summary of residual years of schooling (RYS) for different Degree of Human Development

Degree_of_Human_Development Minimum RYS Maximum RYS Median RYS
VERY HIGH HUMAN DEVELOPMENT 1.462530 9.229639 4.112699
HIGH HUMAN DEVELOPMENT -0.174090 7.905390 4.157615
MEDIUM HUMAN DEVELOPMENT 0.928100 8.908858 4.987901
LOW HUMAN DEVELOPMENT 0.496258 8.788222 5.108076

Column

Analysis for Q3 Part B.

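  • Table 3.2 contradicts the impression given by Table 3.1: the largest single gaps include very high human development countries (Australia tops the list at 9.23 years), yet the median RYS is highest for the low (5.11) and medium (4.99) groups and lowest for the very high group (4.11).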

Part C

Column

Fig 3.3: Count of residual years vs. country with respect to the different HDI categories.

Fig 3.4: Count of residual years vs. country with respect to the different HDI categories.

Column

Analysis for Q3 Part C.

From Fig 3.3 and Fig 3.4, we observe that there are more countries in the ‘high HDI group’ than in the ‘low HDI group’, and that only a small part of the ‘high HDI group’ has excessive differences.

Part A

Q4. Compare and contrast HDI ranks for 2018 and 2019, i.e. compare ranks for countries and examine their decline or increase in HDI rank from 2018 to 2019.

Column

Table 4.1: Count of countries with rank difference.

No. of countries with rank difference
112

Column

Fig 4.1: Rank difference vs. countries.

Analysis for Q4 Part A.

  • More countries experienced a decline in rank from 2018 to 2019 (negative values on the graph) than an increase.

  • The largest negative values are observed for the ‘high HDI group’, i.e. the steepest declines in rank occur among High HDI group countries.

  • The maximum increase is observed in the ‘high HDI group’ as well, i.e. among the countries whose rank went up, most are from the High HDI group.
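
The rank comparison behind Fig 4.1 can be sketched as follows. This is a minimal illustration, not the dashboard's own code: the four countries and their ranks are hypothetical, and the sign convention (2018 rank minus 2019 rank, so that a negative value is a decline) is an assumption inferred from the note above that negative values mean a drop in rank.

```r
# Hypothetical ranks for illustration only; the real values live in
# the HDI_rank_2018 and HDI_rank_2019 columns.
ranks <- data.frame(
  Country       = c("A", "B", "C", "D"),
  HDI_rank_2018 = c(1, 5, 10, 20),
  HDI_rank_2019 = c(1, 7, 9, 20)
)

# Assumed convention: positive = moved up the table (smaller rank number
# in 2019), negative = decline, zero = rank maintained.
ranks$rank_difference <- ranks$HDI_rank_2018 - ranks$HDI_rank_2019

# Split the countries into those whose rank changed and those that kept it.
changed   <- sum(ranks$rank_difference != 0)
unchanged <- sum(ranks$rank_difference == 0)
```

With real data, `changed` and `unchanged` would correspond to the counts reported for countries with and without a rank difference.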

Part B

Column

Table 4.2: No. of countries with same rank

No. of countries with the same rank
77

Column

Fig 4.2: Top 5 ranking countries (rank maintained from 2018 to 2019) from each HDI category.

Analysis for Q4 Part B.

  • Norway, which belongs to the very high HDI group, holds rank 1 (the top position) in both years.

Part A

Column

Part B

References

[1] Wickham et al. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686

[2] Hadley Wickham and Jim Hester (2020). readr: Read Rectangular Text Data. R package version 1.4.0. https://CRAN.R-project.org/package=readr

[3] Hadley Wickham and Evan Miller (2020). haven: Import and Export ‘SPSS’, ‘Stata’ and ‘SAS’ Files. R package version 2.3.1. https://CRAN.R-project.org/package=haven

[4] Hao Zhu (2021). kableExtra: Construct Complex Table with ‘kable’ and Pipe Syntax. R package version 1.3.4. https://CRAN.R-project.org/package=kableExtra

[5] R Core Team (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. URL https://www.R-project.org/.

[6] Frank E Harrell Jr, with contributions from Charles Dupont and many others. (2021). Hmisc: Harrell Miscellaneous. R package version 4.5-0. https://CRAN.R-project.org/package=Hmisc

[7] Baptiste Auguie (2017). gridExtra: Miscellaneous Functions for “Grid” Graphics. R package version 2.3. https://CRAN.R-project.org/package=gridExtra

[8] Katherine Goode and Kathleen Rey (2019). ggResidpanel: Panels and Interactive Versions of Diagnostic Plots using ‘ggplot2’. R package version 0.3.0. https://CRAN.R-project.org/package=ggResidpanel

[9] Claus O. Wilke (2020). cowplot: Streamlined Plot Theme and Plot Annotations for ‘ggplot2’. R package version 1.1.1. https://CRAN.R-project.org/package=cowplot

[10] Lüdecke D (2021). sjPlot: Data Visualization for Statistics in Social Science. R package version 2.8.7. https://CRAN.R-project.org/package=sjPlot

[11] Silge J, Robinson D (2016). “tidytext: Text Mining and Analysis Using Tidy Data Principles in R.” JOSS, 1(3). https://doi.org/10.21105/joss.00037

[12] Kovacevic, M. (2010). Review of HDI Critiques and Potential Improvements. Human Development Research Paper, 2010/33. Retrieved 24 May 2021, from https://www.researchgate.net/publication/235945302_Review_of_HDI_Critiques_and_Potential_Improvements_Human_Development_Research_Paper_201033

[13] Human Development Index (HDI) | Human Development Reports. (2021). Retrieved 24 May 2021, from http://hdr.undp.org/en/content/human-development-index-hdi

Part C

Row

Column

Credits :

  • Xiaoyu Tian

  • Nishtha Arora

  • Shaohu Chen

  • Nurlaily Furqandari Suliana

---
title: "Analysis on Human Development Index 2019"
author: "T12_Fri_kable"
output: 
  flexdashboard::flex_dashboard:
    orientation: rows
    vertical_layout: fill
    source_code: embed
---


```{r Global, message = FALSE, warning= FALSE, echo=FALSE}
knitr::opts_chunk$set(fig.width=15, fig.height=10, fig.align = "center") 
```


```{r LoadingLibraries, message=FALSE, warning=FALSE, echo=FALSE}
library(tidyverse)
library(readr)
library(haven)
library(kableExtra)
library(grid)
library(Hmisc)
library(gridExtra)
library(ggResidpanel)
library(cowplot)
library(sjPlot)
library(flexdashboard)
library(tidytext)
library(ggplot2)
library(plotly)
library(countrycode)
```

Part A {data-navmenu="Introduction"}
===================================== 


+ The Human Development Index (HDI) is a **statistical composite index of life expectancy, education (mean years of schooling completed and expected years of schooling upon entering the education system), and per capita income indicators, which are used to rank countries into four tiers of human development.**

+ It emphasizes the capabilities of people and uses them as a criterion for **assessing the development of a country**. The data used here are extracted from [United Nations Development Programme : Human Development Reports](http://hdr.undp.org/en/composite/HDI). The data set shows indicative **data for countries with very high, high, medium and low human development**.


**Project Details**

+ Name: **Analysis on Human Development Index 2019** (Group Project - ETC5510)

+ Objective: To analyze the data set used, by answering **four research questions.**

+ It compares the countries on HDI value, HDI rank (2019, 2018) and indicators tied to SDGs (Sustainable Development Goals) 3, 4 and 8 (2019), where SDG 3 = Life expectancy at birth, SDG 4.3 = Expected years of schooling, SDG 4.4 = Mean years of schooling, SDG 8.5 = Gross national income (GNI) per capita.


**Research Questions**

+ To compute the **summary statistics** for each variable for every Human Development category.

+ To compare and contrast the **'GNI columns' for 'very high' and 'medium' development countries** and to conduct an analysis to assess **whether the 'very high development countries' translate their income better than the 'medium development countries'** in the areas of human development.

+ Calculate the gap between the ‘expected years of education’ and the ‘average years of education’ for countries with different levels of Human Development, to analyze **if countries with high levels of human development are estimated to have high levels of expected education**. From this, judge **if it is inaccurate to use the ‘average value of expected years of education and average years of education’ for the calculation of HDI.**

+ Compare and contrast HDI ranks for 2018 and 2019 i.e. compare ranks for countries and **examine their decline or increase in HDI rank from 2018 to 2019.**

###
```{r img1, echo = F, out.width = '3%'}
knitr::include_graphics("images/us.jpg")
```

###
```{r img5, echo = F, out.width = '3%'}
knitr::include_graphics("images/ind.webp")
```

###
```{r img6, echo = F, out.width = '3%'}
knitr::include_graphics("images/money.jpg")
```


Part B {data-navmenu="Introduction"}
===================================== 
**Variable Information**

```{r ReadingTidyData, message=FALSE, warning=FALSE, echo=FALSE}
HDI <- read_csv("data/HDIData.csv") %>% rename(Degree_of_Human_Development = y)
```


### **Names**

```{r, echo= FALSE, warning= FALSE, message = FALSE, fig.width= 3}
summary <- colnames(HDI)

knitr::kable(summary, col.names = gsub("[.]", " ", names(summary))) 
```


### **Description**

- **HDI_rank_2019**: Ranking by HDI value for 2019. The HDI is a **composite index measuring average achievement** in three basic dimensions of human development: a long and healthy life, knowledge and a decent standard of living.

- Country: List of countries for which HDI statistics are calculated.

- **HDI_Value**: **Summary measure of average achievement** in key dimensions of human development: a long and healthy life, being knowledgeable and having a decent standard of living.

- **Life_expectancy**: Life expectancy at birth, i.e. the number of years a newborn infant could **expect to live** if prevailing patterns of age-specific mortality rates at the time of birth stay the same throughout the infant’s life.

- **Expected_years_of_schooling**: **Number of years of schooling** that a child of school entrance age **can expect to receive** if prevailing patterns of age-specific enrollment rates persist throughout the child’s life.

- **Mean_years_of_schooling**: **Average number of years of education received** by people ages 25 and older, converted from education attainment levels using official durations of each level.

- **GNI_per_capita**: **'Gross National Income' per capita**, i.e. the **aggregate income of an economy** generated by its production and its ownership of factors of production, less the income paid for the use of factors of production owned by the rest of the world, converted to international dollars using PPP (Purchasing Power Parity) rates, divided by midyear population.

- GNI_rank_minus_HDI_rank: Difference in ranking by GNI per capita and by HDI value. A negative value means that the country is better ranked by GNI than by HDI value.

- HDI_rank_2018: Ranking by HDI value for 2018, calculated using the same most recently revised data available in 2020 that were used to calculate HDI values for 2019.

- **Degree_of_Human_Development** : **The cutoff-points are HDI of less than 0.550 for low human development, 0.550–0.699 for medium human development, 0.700–0.799 for high human development and 0.800 or greater for very high human development.**
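
The cutoff points above can be expressed as a small classifier. This is a sketch only: the function name `classify_hdi` is hypothetical, and the data set already ships with the category column, so nothing in the dashboard depends on it.

```r
# Hypothetical helper: map HDI values to the four development tiers
# using the cutoffs listed in the description above.
classify_hdi <- function(hdi) {
  cut(
    hdi,
    breaks = c(-Inf, 0.550, 0.700, 0.800, Inf),
    labels = c("LOW HUMAN DEVELOPMENT",
               "MEDIUM HUMAN DEVELOPMENT",
               "HIGH HUMAN DEVELOPMENT",
               "VERY HIGH HUMAN DEVELOPMENT"),
    right = FALSE  # intervals are closed on the left, e.g. [0.550, 0.700)
  )
}

# Illustrative values chosen to sit on and around the boundaries.
tiers <- classify_hdi(c(0.394, 0.550, 0.699, 0.800, 0.957))
```

Setting `right = FALSE` matters here: a value of exactly 0.550 must fall into the medium tier and exactly 0.800 into the very high tier.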


Question 1
===================================== 
Q1. **To compute the summary statistics for each variable for every Human Development category.**
 

Column {data-width=300}
---------------------------------------------------


### Fig 1.1: The **number of countries** in each degree of human development.

```{r Fig1, fig.height= 4, fig.width=7, message=FALSE, warning=FALSE, echo=FALSE}

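# Count the countries in each degree of human development.
plot1 <- ggplot(HDI, aes(x = Degree_of_Human_Development,
                fill = Degree_of_Human_Development)) + 
  geom_bar() +
  scale_x_discrete(labels = NULL) +
  # theme_classic() is a complete theme, so apply it before the tweaks below.
  theme_classic() +
  theme(axis.text.x = element_blank(),
        axis.ticks.x = element_blank(),
        plot.title = element_text(hjust = 0.5))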
ggplotly(plot1)
```


### Fig 1.2: Boxplots that summarize the **descriptive statistics** for each variable.

```{r Fig2, fig.height= 3, fig.width=9, echo= FALSE, warning=FALSE, message=FALSE}
HDI_Valueplot <- ggplot(HDI,
                        aes(y = HDI_Value,
                            fill = "HDI_Value")) +
   labs(x = 'HDI_Value', y = '') +
  geom_boxplot(show.legend = FALSE) +
  theme(axis.text.x = element_blank(),
        axis.ticks.x = element_blank()) 
 Life_expectancyplot <- ggplot(HDI,
                        aes(y = Life_expectancy,
                          fill = "Life_expectancy")) +
  labs(x = 'Life Expectancy', y = '') +
  geom_boxplot(show.legend = FALSE) +
  theme(axis.text.x = element_blank(),
        axis.ticks.x = element_blank()) 
Expected_years_of_schoolingplot <- ggplot(HDI,
                        aes(y = Expected_years_of_schooling,
                           fill = "Expected_years_of_schooling")) +
  labs(x = 'Expected Year of Schooling', y = '') +
  geom_boxplot(show.legend = FALSE) +
  theme(axis.text.x = element_blank(),
        axis.ticks.x = element_blank()) 
Mean_years_of_schoolingplot <- ggplot(HDI,
                        aes(y = Mean_years_of_schooling,
                           fill = "Mean_years_of_schooling")) +
  labs(x = 'Mean Year of Schooling', y = '') +
  geom_boxplot(show.legend = FALSE) +
  theme(axis.text.x = element_blank(),
        axis.ticks.x = element_blank()) 
GNI_per_capitaplot <- ggplot(HDI,
                        aes(y = GNI_per_capita,
                           fill = "GNI_per_capita")) +
  labs(x = 'GNI per Capita', y = '') +
  geom_boxplot(show.legend = FALSE) +
  scale_y_continuous(labels = scales::comma)+
  theme(axis.text.x = element_blank(),
        axis.ticks.x = element_blank()) 
grid.arrange(HDI_Valueplot, 
             Expected_years_of_schoolingplot, 
             Life_expectancyplot, 
             Mean_years_of_schoolingplot, 
             GNI_per_capitaplot, 
             nrow = 1)
```

Column {data-height=300}
-------------------------------------
### Table 1.1: **Descriptive analysis** for each variable based on Degree of Human Development.

```{r Tab1,  message = FALSE, warning= FALSE, echo=FALSE}
HDI_long <- pivot_longer(HDI, c(3:7), names_to = "Variable") %>%
  group_by(Variable, Degree_of_Human_Development) %>%
  summarise(Minimum = round(min(value), digits = 2),
            Median = round(median(value), digits = 2),
            Mean = round(mean(value), digits = 2),
            Maximum = round(max(value), digits = 2),
            SD = round(sd(value), digits = 2)) %>%
  arrange(Variable, -Maximum)
knitr::kable(HDI_long,
              booktabs = TRUE) %>%
   kable_styling(full_width = TRUE, bootstrap_options = "bordered") %>% 
  kable_classic() 

```


Part A {data-navmenu="Question 2"}
===================================== 
Q2. To compare and contrast the **GNI columns** for **very high and medium development countries** and to conduct an analysis to assess whether the **very high development countries translate their income better than the medium development countries** in the areas of human development.

Column {data-width=400}
-------------------------------------

```{r Q2Filter, message = FALSE, warning= FALSE, echo=FALSE}
hdi_r <- HDI %>% 
  select(Country, HDI_rank_2019, HDI_Value, GNI_per_capita, GNI_rank_minus_HDI_rank, Degree_of_Human_Development) %>% 
  filter(Degree_of_Human_Development %in% c("VERY HIGH HUMAN DEVELOPMENT", "MEDIUM HUMAN DEVELOPMENT")) %>%
  mutate(gni_rank_2019 = HDI_rank_2019 + GNI_rank_minus_HDI_rank)
```

### Fig 2.1: Understanding the **distribution of variables**.

```{r Fig3, warning=FALSE, message=FALSE, echo=FALSE, fig.height= 4, fig.width=7}
plot2 <- hdi_r %>% ggplot(aes(x = GNI_per_capita, fill = Degree_of_Human_Development)) + 
  geom_histogram(bins = 29, alpha = 0.4) +
  scale_x_continuous(labels = scales::comma)+
  scale_fill_manual(values=c("#E69F00", "#56B4E9"))+
  theme_classic()
ggplotly(plot2)
```


### Fig 2.2: **Log-transformation of GNI per capita** for further analysis.
```{r Fig4, warning=FALSE, message=FALSE, echo=FALSE, fig.height= 4, fig.width=7}
lnhdi_r <- hdi_r %>%
  mutate(`Log of GNI per capita` = log(GNI_per_capita))
plot3 <- lnhdi_r %>% ggplot(aes(x = `Log of GNI per capita`, fill = Degree_of_Human_Development))+ 
  geom_histogram(bins = 29, alpha = 0.4) +
  scale_fill_manual(values=c("#E69F00", "#56B4E9"))+
  theme_classic()
ggplotly(plot3)
```
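
As a rough check of what the log transformation does, the sketch below takes the minimum, median and maximum GNI per capita of the very high group from Table 1.1 and compares how far the upper tail stretches relative to the lower tail before and after taking logs. The variable names are illustrative only.

```r
# Minimum, median and maximum GNI per capita for the very high group,
# copied from Table 1.1.
gni_very_high <- c(min = 14428.80, median = 39870.68, max = 131031.59)

# Natural log, as used for `Log of GNI per capita`.
log_gni <- log(gni_very_high)

# Ratio of (max - median) to (median - min): 1 would be perfectly symmetric.
tail_ratio <- function(x) (x[["max"]] - x[["median"]]) / (x[["median"]] - x[["min"]])
raw_ratio <- tail_ratio(gni_very_high)
log_ratio <- tail_ratio(log_gni)
```

On the raw scale the upper tail is several times longer than the lower tail, while on the log scale the two are close to equal, which is why the log scale suits a linear model better. Rounded to one decimal place, `log_gni` also lines up with the very high row of Table 2.1.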

Column {data-height=250}
-------------------------------------
### Table 2.1: **GNI Summary** Statistics.

```{r Tab2, message = FALSE, warning= FALSE, echo=FALSE}
lnhdi_r %>% 
  group_by(Degree_of_Human_Development) %>%
  summarise(min = min(`Log of GNI per capita`, na.rm=TRUE),
            q1 = quantile(`Log of GNI per capita`, 0.25, na.rm=TRUE),
            median = median(`Log of GNI per capita`, na.rm=TRUE),
            q3 = quantile(`Log of GNI per capita`, 0.75, na.rm=TRUE),
            max = max(`Log of GNI per capita`, na.rm=TRUE),
            mean = mean(`Log of GNI per capita`, na.rm=TRUE), 
            sd = sd(`Log of GNI per capita`, na.rm=TRUE),
            n = n()) %>% 
  kbl(digits = 1)%>% 
kable_material("hover", full_width = T)

```

Part B {data-navmenu="Question 2"}
===================================== 

Column {data-height=400}
-------------------------------------

### Fig 2.3: Distribution of **GNI per capita (log)** for each degree of human development.

```{r Fig5, warning=FALSE, message=FALSE, echo=FALSE, fig.height= 5, fig.width=5}
a1 <- lnhdi_r %>% 
  ggplot(aes(x = Degree_of_Human_Development , y= (`Log of GNI per capita`))) +
    geom_violin(draw_quantiles = c(0.25, 0.5, 0.75), 
                fill = "lemonchiffon1") +
  ylab("Log of GNI per capita") +
  xlab("") +
  theme_classic()
ggplotly(a1)
```


### Fig 2.4: Comparing the **regression**.

```{r Fig6, message = FALSE, warning= FALSE, echo=FALSE, fig.height= 5, fig.width=7}
# HDI value against log GNI per capita, with one fitted line per group.
plot4 <- lnhdi_r %>%
  ggplot(aes(x = `Log of GNI per capita`, y = HDI_Value, colour = Degree_of_Human_Development)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  scale_color_manual(values = c("Red", "Blue")) +
  labs(x = "Log of GNI per capita", y = "HDI value", colour = "HDI group") +
  # theme_classic() is a complete theme, so apply it before the legend tweak.
  theme_classic() +
  theme(legend.position = "top")
ggplotly(plot4)
  
```

Column {data-height=200}
-------------------------------------

### Table 2.2: Values for **very high HDI group**.

```{r, message = FALSE, warning= FALSE, echo=FALSE}
vhigh <- lnhdi_r %>% filter(Degree_of_Human_Development == "VERY HIGH HUMAN DEVELOPMENT")
vmod <- lm(`Log of GNI per capita` ~ HDI_Value, data = vhigh)
tab_model(vmod)
```

### Table 2.3: Values for **medium HDI group**.
```{r Tab3, message = FALSE, warning= FALSE, echo=FALSE}
med <- lnhdi_r %>% filter(Degree_of_Human_Development == "MEDIUM HUMAN DEVELOPMENT")
mmod <- lm(`Log of GNI per capita` ~ HDI_Value, data = med)
tab_model(mmod)

```

Column {data-height=50}
-------------------------------------

The fitted **slope is 6.71** for the Very High HDI Group and **6.01** for the Medium HDI Group.
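
To make the two slopes easier to read, the sketch below converts them into an approximate percentage change in GNI per capita for a 0.01 increase in HDI value (the response is on the log scale), and checks whether the two confidence intervals from Table 2.2 and Table 2.3 overlap. The variable names are illustrative, and an overlap check is only a rough screen, not a formal test of equal slopes.

```r
# Slopes and 95% confidence intervals copied from Table 2.2 and Table 2.3.
slope   <- c(very_high = 6.71, medium = 6.01)
ci_low  <- c(very_high = 5.11, medium = 3.08)
ci_high <- c(very_high = 8.30, medium = 8.94)

# The response is log(GNI per capita), so a 0.01 rise in HDI value
# multiplies GNI per capita by exp(slope * 0.01).
pct_change <- (exp(slope * 0.01) - 1) * 100

# Two intervals overlap when each lower bound lies below the other's upper bound.
intervals_overlap <- ci_low[["very_high"]] <= ci_high[["medium"]] &&
  ci_low[["medium"]] <= ci_high[["very_high"]]
```

A more careful comparison would fit a single model with an interaction between HDI value and group and inspect the interaction term; that step is not part of this dashboard.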



Part C {data-navmenu="Question 2"}
===================================== 

Column {data-height=300}
-------------------------------------

### Fig 2.5: Diagnostic **Plots for Linear Models**.

```{r Fig7, message = FALSE, warning= FALSE, echo=FALSE, fig.width= 8, fig.height= 5}
resid_compare(models = list(vmod,mmod
                            ),
              plots = c("resid", "qq", "hist"),
              smoother = TRUE,
              qqbands = TRUE,
              title.opt = TRUE)+
  theme_classic() 
```


### Analysis for Q2 Part C.

+ In Fig 2.5, the **left-hand graphs** represent the 'Very High HDI Group' and the **right-hand graphs** represent the 'Medium HDI Group'.

+ From the diagnostic plots, we see that **the residuals of both models are approximately normally distributed**, so a linear model is a **reasonable fit** to the data.

+ However, from Table 2.4 and Table 2.5, we see that **both models show a low coefficient of determination** (R² of 0.52 and 0.33 respectively). This suggests that **GNI is not the only fundamental variable that determines the HDI value.**

+ Therefore, we can conclude that although the very high HDI group performs better in both GNI and HDI than the medium HDI group, there is only a **medium correlation between GNI and HDI values, so the evidence that the very high HDI group translates income better is limited.**
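
The strength of the correlation mentioned above can be read directly off the R² values: in a simple linear regression with a single predictor, R² is the square of Pearson's correlation coefficient. The sketch below applies that identity to the values reported in Table 2.4 and Table 2.5; the variable names are illustrative.

```r
# R-squared values copied from Table 2.4 and Table 2.5.
r_squared <- c(very_high = 0.5247895, medium = 0.3311233)

# With one predictor and a positive slope, Pearson's r is the
# positive square root of R-squared.
pearson_r <- sqrt(r_squared)
round(pearson_r, 2)
```

This gives a correlation of roughly 0.72 for the very high group and 0.58 for the medium group.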


Column {data-height=100}
-------------------------------------

### Table 2.4: **Very High HDI Group** Linear Model Values.
```{r Tab4, message = FALSE, warning= FALSE, echo=FALSE}
# Model-level fit statistics (R-squared, AIC, BIC, ...) for the very high HDI model.
i <- broom::glance(vmod)

knitr::kable(i, align = "c") %>% 
  kable_material(c("striped", "hover"))
```

Column {data-height=100}
-------------------------------------

### Table 2.5: **Medium HDI Group** Linear Model Values.
```{r Tab9, message = FALSE, warning= FALSE, echo=FALSE}
j <- broom::glance(mmod)
knitr::kable(j, align = "c") %>% 
  kable_material(c("striped", "hover"))

```


Part A {data-navmenu="Question 3"}
===================================== 

Column {data-height=100}
---------------------------------------------------

### Q3. Calculate the **gap** between the ‘expected years of education’ and the ‘average years of education’ for countries with different levels of Human Development. Let this gap be called: **Residual years of education**.

We analyze whether countries with high levels of human development are estimated to have high levels of expected education. From this, we judge **if it is inaccurate to use the ‘average value of expected years of education and average years of education’ for the calculation of HDI.**
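
For context, the sketch below shows how expected and mean years of schooling enter the HDI. It assumes the goalposts published in the UNDP technical notes (life expectancy 20 to 85 years, expected schooling capped at 18 years, mean schooling scaled by 15 years, GNI per capita between 100 and 75,000 PPP dollars); these values are not part of this dashboard's data, and both the function name `hdi_sketch` and the input values are hypothetical.

```r
# Sketch of the HDI calculation, assuming the UNDP goalposts named above.
hdi_sketch <- function(life_exp, eys, mys, gni_pc) {
  health    <- (life_exp - 20) / (85 - 20)
  eys_index <- pmin(eys, 18) / 18          # expected years of schooling, capped at 18
  mys_index <- mys / 15                    # mean years of schooling
  education <- (eys_index + mys_index) / 2 # simple average of the two education indices
  income    <- (log(pmin(gni_pc, 75000)) - log(100)) / (log(75000) - log(100))
  (health * education * income)^(1 / 3)    # geometric mean of the three dimensions
}

# Hypothetical inputs for illustration only.
example_hdi <- hdi_sketch(life_exp = 82.4, eys = 18.1, mys = 12.9, gni_pc = 66494)
```

Because the education dimension is a plain average of the two normalized indices, a country with a large gap between expected and mean years of schooling is credited for schooling its adult population has not yet received. The residual years of education are meant to expose that effect.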

Column {data-height=500}
---------------------------------------------------

```{r Q3Filter, message = FALSE, warning= FALSE, echo=FALSE}
HDI_residual <- HDI %>% 
  mutate(residual_years_of_schooling = Expected_years_of_schooling - Mean_years_of_schooling) %>% arrange(desc(residual_years_of_schooling))
HDI_residual_col <- HDI_residual %>% mutate(Degree_of_Human_Development = as.factor(Degree_of_Human_Development), new_col = reorder_within(Country, residual_years_of_schooling,Degree_of_Human_Development))
```

### Fig 3.1: **Residual years** of education for **High HDI group (Green)** and **Very High HDI group (Red).**
```{r Fig8, message = FALSE, warning= FALSE, echo=FALSE, fig.height= 17, fig.width=16 }
a <- HDI_residual_col %>% 
  filter(Degree_of_Human_Development %in% c("VERY HIGH HUMAN DEVELOPMENT", "HIGH HUMAN DEVELOPMENT"))
ggplot(a,
       aes(x = residual_years_of_schooling,
           y = new_col, fill= Degree_of_Human_Development), show.legend = FALSE) + 
  geom_col(cex = 5)+
    scale_y_reordered() + 
  facet_wrap(~Degree_of_Human_Development, scales = "free") +
  labs( y = "Countries' names",
        x = "Residual years of education"
         ) +
  theme_classic(base_size = 15)+
  theme(legend.position = "none")+
  theme(axis.text = element_text(size = 15))+
  scale_fill_brewer(palette = "Dark2")
```

### Fig 3.2: **Residual years** of education for **Low HDI group (Light Green)** and **Medium HDI group (Orange).**
```{r Fig, message = FALSE, warning= FALSE, echo=FALSE, fig.height= 17, fig.width=16 }
b <- HDI_residual_col %>% 
filter(Degree_of_Human_Development %in% c("MEDIUM HUMAN DEVELOPMENT", "LOW HUMAN DEVELOPMENT"))
ggplot(b,
       aes(x = residual_years_of_schooling,
           y = new_col, fill= Degree_of_Human_Development), show.legend = FALSE) + 
  geom_col(cex = 5)+
    scale_y_reordered() + 
  facet_wrap(~Degree_of_Human_Development, scales = "free") +
  labs( y = "Countries' names",
        x = "Residual years of education"
         ) +
  theme_classic(base_size = 15)+
  theme(legend.position = "none")+
  theme(axis.text = element_text(size = 15))+
  scale_fill_brewer(palette = "Pastel2")
```

Column {data-height=50}
---------------------------------------------------
### Analysis for Q3 Part A.

+ Fig 3.1 and 3.2 show that the residual years of education in **some** countries with high human development are larger than in countries with low human development (Kovacevic, 2010).


Part B {data-navmenu="Question 3"}
===================================== 

Column {data-height=400}
-------------------------------------

### Table 3.1: Top 12 countries with the **largest residual years** of schooling across the different degrees of human development.

```{r Tab5, message = FALSE, warning= FALSE, echo=FALSE, fig.width= 12}
# Top 12 countries overall by residual years of schooling.
# No grouping is needed because the ranking is global, not per category.
detail_educa_year <- HDI_residual %>%
  select(Country, Degree_of_Human_Development, residual_years_of_schooling) %>%
  arrange(desc(residual_years_of_schooling)) %>%
  head(12)

knitr::kable(detail_educa_year,
             booktabs = TRUE) %>%
  # "hold_position" is a LaTeX option, not a bootstrap option, so it is dropped here
  kable_styling(bootstrap_options = "striped") %>% 
  kable_material() %>% 
  row_spec(1, bold = TRUE, color = "Black", background = "Red") %>% 
  row_spec(2, bold = TRUE, color = "Black", background = "Gray") %>% 
  row_spec(3, bold = TRUE, color = "Black", background = "Yellow") %>% 
  row_spec(7, bold = TRUE, color = "Black", background = "orange")
```

Column {data-height=300}
-------------------------------------

### Table 3.2: **Summary of residual years of schooling** (RYS) for different Degree of Human Development

```{r Tab6, message = FALSE, warning= FALSE, echo=FALSE, fig.width=8}
summary_educa_year <-HDI_residual %>% 
  group_by(Degree_of_Human_Development) %>% 
  summarise(`Minimum RYS` = min(residual_years_of_schooling, na.rm = TRUE), `Maximum RYS`= max(residual_years_of_schooling, na.rm = TRUE),
            `Median RYS` = median(residual_years_of_schooling, na.rm = TRUE)) %>% arrange(`Median RYS`) 

knitr::kable(summary_educa_year,
             booktabs = TRUE) %>%
   kable_styling(bootstrap_options  = c("striped", "hold_position")) %>% 
  kable_material()
```
Column {data-height=50}
---------------------------------------------------
### Analysis for Q3 Part B.

+ The group-level summary in Table 3.2 contradicts the country-level ranking in Table 3.1.
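
A top-N listing and a per-group median can point in different directions, because the first reflects extreme values and the second reflects the centre of each group. The sketch below illustrates this with made-up numbers; the data frame `toy`, its group labels and its values are hypothetical and are not taken from the HDI data.

```r
# Hypothetical toy data (NOT the HDI data set): two groups of five values.
toy <- data.frame(
  group    = rep(c("A", "B"), each = 5),
  residual = c(9, 1, 1, 1, 1,   # group A: one extreme value, low centre
               4, 4, 4, 4, 4)   # group B: no extremes, higher centre
)

# Largest single value per group (what a top-N table emphasises)
top_by_group <- tapply(toy$residual, toy$group, max)

# Median per group (what a summary table emphasises)
median_by_group <- tapply(toy$residual, toy$group, median)

top_by_group     # group A leads on the maximum
median_by_group  # group B leads on the median
```

Whether this is what drives the difference between the two tables would need to be checked against the actual residuals.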

Part C {data-navmenu="Question 3"}
===================================== 

Column {data-height=400}
-------------------------------------

### Fig 3.3: **Residual years of schooling (log scale) by HDI category**, shown as jittered points with mean ± 1 SD.
```{r Fig9, fig.height= 12, fig.width=20, message = FALSE, warning= FALSE, echo=FALSE}
fig1<- HDI_residual %>% 
  mutate(residual_years_of_schooling_log = log10(residual_years_of_schooling+20)) %>% 
  mutate(Degree_of_Human_Development = fct_reorder(as_factor(Degree_of_Human_Development),
                                  residual_years_of_schooling_log,
                                  median, na.rm= TRUE)) %>% 
  ggplot(aes(x=Degree_of_Human_Development, y = residual_years_of_schooling_log, fill= Degree_of_Human_Development)) + 
  geom_point(alpha = 0.2) +  # alpha must lie in [0, 1]
geom_jitter(position=position_jitter(0.2)) + 
  stat_summary(fun.data="mean_sdl",  fun.args = list(mult=1), 
               geom="pointrange")+
  xlab("Degree of Human Development") + 
  ylab("residual years of schooling (log)") +
  theme_bw() +
   scale_x_discrete(labels = NULL)+
  theme(axis.text = element_text(size = 7))
ggplotly(fig1)
```


### Fig 3.4: **Residual years of schooling (log scale) by HDI category**, shown as a violin plot.

```{r Fig12, fig.height= 12, fig.width=20, message = FALSE, warning= FALSE, echo=FALSE}
fig2<-HDI_residual %>% 
  mutate(residual_years_of_schooling_log = log10(residual_years_of_schooling+20)) %>% 
  mutate(Degree_of_Human_Development = fct_reorder(as_factor(Degree_of_Human_Development),
                                  residual_years_of_schooling_log,
                                  median, na.rm= TRUE)) %>% 
  ggplot(aes(x=Degree_of_Human_Development, y = residual_years_of_schooling_log,
             fill= Degree_of_Human_Development)) + 
  geom_point(alpha = 0.2) + 
  geom_violin(draw_quantiles = c(0.1, 0.25, 0.5)) + 
   scale_x_discrete(labels = NULL)+
  xlab("Degree of Human Development") + 
  ylab("residual years of schooling (log)") +
  theme_classic()
ggplotly(fig2)
```

Column {data-height=50}
-------------------------------------

### Analysis for Q3 Part C.

From Fig 3.3 and Fig 3.4, we observe that there are more countries in the 'high HDI group' than in the 'low HDI group', **and that only a small part of the 'high HDI group' has excessive differences.**


Part A {data-navmenu="Question 4"}
===================================== 

Q4. Compare and contrast HDI ranks for 2018 and 2019, i.e. compare ranks for countries and **examine their decline or increase in HDI rank** from 2018 to 2019.

Column {data-height=150}
---------------------------------------------------

```{r SelectingforQ4, message=FALSE, warning=FALSE, echo=FALSE}
Q4 <- HDI %>% 
  select(HDI_rank_2019, Country,HDI_rank_2018, Degree_of_Human_Development) %>% 
  mutate(Difference = if_else(condition = HDI_rank_2019 == HDI_rank_2018,
                              true = "Same",
                              false = "Different")) 
```


```{r Diff, message=FALSE, warning=FALSE, echo=FALSE}
Diffr <- Q4 %>% 
  dplyr::filter(Difference == "Different") 
Same <- Q4 %>% 
  dplyr::filter(Difference == "Same")
```


### Table 4.1:  Count of countries with rank difference.
```{r Tab7, message=FALSE, warning=FALSE, echo=FALSE, fig.width=5}
Tab1 <- count(Diffr) %>% 
rename(`No. of countries with rank difference` = n )
knitr::kable(Tab1, align = "c") %>% 
  kable_material()
```



Column {data-height=450}
---------------------------------------------------

### Fig: 4.1 **Rank difference v/s Countries**
```{r Fig10, message=FALSE, warning=FALSE, echo=FALSE, fig.width=7, fig.height=5}
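# DiffValue is the 2019 rank minus the 2018 rank. "Increase" means the rank
# number is larger in 2019 than in 2018; "Decline" means it is smaller.
Fig1 <- Diffr %>% 
  mutate(IncOrDec = if_else(condition = HDI_rank_2019 > HDI_rank_2018,
                             true = "Increase",
                             false = "Decline")) %>% 
  mutate(DiffValue = HDI_rank_2019 - HDI_rank_2018)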

plot6 <- ggplot(Fig1 , aes( IncOrDec, DiffValue, fill= Degree_of_Human_Development)) +
  geom_bar(stat = "Identity") +
theme_classic() +
scale_fill_brewer(palette = "Pastel2")+
  theme(axis.title.x = element_blank())
ggplotly(plot6)
```



### Analysis for Q4 Part A.

+ **More countries** experienced a **decline in rank** from 2018 to 2019 (negative values on the graph) than an increase in rank.

+ **Larger negative values** are observed for the **'high HDI group'**, i.e. the decline in rank is greatest among high HDI group countries.

+ The **maximum increase** is also observed in the '**high HDI group**', i.e. among countries with an increase in rank, most are from the high HDI group.
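
Since rank 1 is the top position, the sign of the rank difference needs careful reading: in Fig 4.1, "Decline" and negative values refer to the rank number itself getting smaller from 2018 to 2019. A minimal sketch of the same arithmetic, using made-up countries and ranks (the data frame `ranks` and all its values are hypothetical, not HDI data):

```r
# Hypothetical ranks for three made-up countries (NOT HDI data).
ranks <- data.frame(
  Country   = c("X", "Y", "Z"),
  rank_2018 = c(10, 5, 7),
  rank_2019 = c(8, 9, 7)
)

# Same arithmetic as DiffValue in the Fig 4.1 chunk: 2019 rank minus 2018 rank.
ranks$DiffValue <- ranks$rank_2019 - ranks$rank_2018

# Label the direction of change of the rank number.
ranks$Change <- ifelse(ranks$DiffValue > 0, "Increase",
                ifelse(ranks$DiffValue < 0, "Decline", "Same"))

ranks
```

Here country X moves from rank 10 to rank 8, so its `DiffValue` is negative and it is labelled "Decline", even though it has moved closer to rank 1.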







Part B {data-navmenu="Question 4"}
===================================== 

Column {data-height=50}
---------------------------------------------------

### Table 4.2:  No. of countries with same rank
```{r Tab8, message=FALSE, warning=FALSE, echo=FALSE, fig.width=8}
Tab2 <- (count(Q4) - count(Diffr)) %>% 
        rename(`No. of countries with the same rank` = n ) 
knitr::kable(Tab2, align = "c") %>% 
  kable_material()
```

```{r, message=FALSE, warning=FALSE, echo=FALSE}
# Countries with an unchanged rank, split by HDI category
VH <- Same %>% 
  filter(Degree_of_Human_Development == "VERY HIGH HUMAN DEVELOPMENT")
H <- Same %>% 
  filter(Degree_of_Human_Development == "HIGH HUMAN DEVELOPMENT")
M <- Same %>% 
  filter(Degree_of_Human_Development == "MEDIUM HUMAN DEVELOPMENT")
L <- Same %>% 
  filter(Degree_of_Human_Development == "LOW HUMAN DEVELOPMENT")

# Keep the first five rows of each category and stack them for plotting
scatter <- bind_rows(head(VH, 5), head(H, 5), head(M, 5), head(L, 5))
```

Column {data-height=250}
---------------------------------------------------

### Fig 4.2: **Top 5** ranking countries (**rank maintained** from 2018 to 2019) from **each HDI category**.

```{r Fig11, warning=FALSE, echo=FALSE, fig.height=9, fig.width=25}
df <- scatter
fig <- df %>%
  plot_ly(
    y = ~Country, 
    x = ~HDI_rank_2019, 
    color = ~Degree_of_Human_Development, 
    frame = ~Degree_of_Human_Development, 
    type = 'scatter',
    mode = 'markers'
  ) 
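# Apply the animation options and hide the legend before printing,
# so that the displayed figure carries both settings.
fig <- fig %>%
  animation_opts(
    1000, easing = "elastic", redraw = FALSE
  ) %>%
  hide_legend()
fig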

```

### Analysis for Q4 Part B.

+ The top position (rank 1) is held by **Norway** in both years, which **belongs to the very high HDI group.**



Part A {data-navmenu="Conclusion"}
===================================== 

- **Most** countries fall under *Very High Human Development* and **few** under *Low Human Development*. 

- HDI value, life expectancy, and mean years of schooling have more values than 'expected years of schooling' and GNI. Also, **outliers exist only in the variables expected years of schooling and GNI per capita.**

- **Higher income** corresponds to a **higher human development group**, which means a **higher HDI value**. 

- **Very High HDI Group** translates **income better.**

- It is **accurate** to use the **‘average value of expected years of education and average years of education’ for the calculation of HDI.**

- There are **112 countries** that have a **different HDI rank** for 2018 and 2019. Of these, **more** countries have had a **decline** in rank from 2018 to 2019 than an increase.

- There are **77 countries** that have the **same rank** for 2018 and 2019, out of which, **Norway has maintained its 1st rank overall.**

Column {data-height=300}
---------------------------------------------------

```{r, message=FALSE, warning=FALSE, echo=FALSE}
HDI$iso3 <- countrycode(HDI$Country, 'country.name', 'iso3c')
HDI[HDI$Country=="Kosovo","iso3"] <- "XKX"
```


```{r map, fig.width=5, fig.height=5, echo=FALSE, warning=FALSE, message=FALSE, out.width= '60%'}
fig <- plot_ly(HDI, type = 'choropleth', locations = HDI$iso3, z = HDI$HDI_rank_2019,
               text = ~paste("\nCountry: ", Country, "\nHDI rank 2019:", HDI_rank_2019),
               colorscale = "Viridis")
fig <- fig %>% colorbar(title = "HDI Rank")
fig <- fig %>% layout(title = 'HDI Country ranks 2019')
fig
```

###

```{r img13, echo = F, out.width = '3%'}
knitr::include_graphics("images/g.png")
```

Part B {data-navmenu="Conclusion"}
=====================================

**References**

[1] Wickham et al. (2019). Welcome to the tidyverse. Journal of Open Source Software, 4(43), 1686. https://doi.org/10.21105/joss.01686

[2] Hadley Wickham and Jim Hester (2020). readr: Read Rectangular Text Data. R package version 1.4.0. https://CRAN.R-project.org/package=readr

[3] Hadley Wickham and Evan Miller (2020). haven: Import and Export 'SPSS', 'Stata' and 'SAS' Files. R package version 2.3.1. https://CRAN.R-project.org/package=haven

[4] Hao Zhu (2021). kableExtra: Construct Complex Table with 'kable' and Pipe Syntax. R package version 1.3.4. https://CRAN.R-project.org/package=kableExtra

[5] R Core Team (2021). R: A language and environment for statistical computing. R Foundation for Statistical Computing, Vienna, Austria. https://www.R-project.org/

[6] Frank E Harrell Jr, with contributions from Charles Dupont and many others (2021). Hmisc: Harrell Miscellaneous. R package version 4.5-0. https://CRAN.R-project.org/package=Hmisc

[7] Baptiste Auguie (2017). gridExtra: Miscellaneous Functions for "Grid" Graphics. R package version 2.3. https://CRAN.R-project.org/package=gridExtra

[8] Katherine Goode and Kathleen Rey (2019). ggResidpanel: Panels and Interactive Versions of Diagnostic Plots using 'ggplot2'. R package version 0.3.0. https://CRAN.R-project.org/package=ggResidpanel

[9] Claus O. Wilke (2020). cowplot: Streamlined Plot Theme and Plot Annotations for 'ggplot2'. R package version 1.1.1. https://CRAN.R-project.org/package=cowplot

[10] Lüdecke D (2021). _sjPlot: Data Visualization for Statistics in Social Science_. R package version 2.8.7.

[11] Silge J, Robinson D (2016). "tidytext: Text Mining and Analysis Using Tidy Data Principles in R." _JOSS_, *1*(3). https://doi.org/10.21105/joss.00037

[12] Kovacevic, M. (2010). Review of HDI Critiques and Potential Improvements. Human Development Research Paper, 2010/33. Retrieved 24 May 2021, from https://www.researchgate.net/publication/235945302_Review_of_HDI_Critiques_and_Potential_Improvements_Human_Development_Research_Paper_201033

[13] Human Development Index (HDI) | Human Development Reports. (2021). Retrieved 24 May 2021, from http://hdr.undp.org/en/content/human-development-index-hdi

Part C {data-navmenu="Conclusion"}
=====================================

Row {data-width=600}
-------------------------------------

###

```{r img9, echo = F, out.width = '5%'}
knitr::include_graphics("images/thanks.jpg")
```

###

```{r img10, echo = F, out.width = '3%'}
knitr::include_graphics("images/qu.jpg")
```

Column {data-height=300}
-------------------------------------

### **Credits** :

- Xiaoyu Tian
- Nishtha Arora
- Shaohu Chen
- Nurlaily Furqandari Suliana

###

```{r img11, echo = F, out.width = '0.1%'}
knitr::include_graphics("images/mon.png")
```